FHM: Faster High-Utility Itemset Mining Using Estimated Utility Co-occurrence Pruning
Abstract
High-utility itemset mining is a challenging task in frequent pattern mining and has wide applications. The state-of-the-art algorithm is HUI-Miner. It adopts a vertical representation and performs a depth-first search to discover patterns and calculate their utility without performing costly database scans. Although this approach is effective, mining high-utility itemsets remains computationally expensive because HUI-Miner has to perform a costly join operation for each pattern generated by its search procedure. In this paper, we address this issue by proposing a novel strategy based on the analysis of item co-occurrences to reduce the number of join operations that need to be performed. An extensive experimental study with four real-life datasets shows that the resulting algorithm, named FHM (Fast High-Utility Miner), reduces the number of join operations by up to 95% and is up to six times faster than the state-of-the-art algorithm HUI-Miner.
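The abstract does not spell out how the co-occurrence analysis works, so the following is only a minimal sketch of one way such pruning can be done, assuming the usual transaction-weighted utility (TWU) upper bound: the TWU of an item pair is the summed utility of all transactions containing both items, and no itemset containing that pair can have a utility above it. The toy database and the names `build_pair_twu` and `can_skip_join` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy database: each transaction maps an item to its utility
# (for example, profit) in that transaction.
transactions = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "d": 3},
    {"a": 5, "b": 2, "d": 1},
]


def build_pair_twu(transactions):
    """For every item pair, sum the total utility of the transactions containing both."""
    pair_twu = defaultdict(int)
    for t in transactions:
        transaction_utility = sum(t.values())
        # sorted() gives each pair a canonical (smaller, larger) key.
        for x, y in combinations(sorted(t), 2):
            pair_twu[(x, y)] += transaction_utility
    return pair_twu


def can_skip_join(pair_twu, x, y, min_util):
    """Return True if no itemset containing both x and y can reach min_util.

    The pair's TWU is an upper bound on the utility of every superset of
    {x, y}, so a join producing such an itemset can be skipped safely.
    """
    key = (x, y) if x < y else (y, x)
    return pair_twu.get(key, 0) < min_util


pair_twu = build_pair_twu(transactions)
min_util = 15  # assumed threshold, chosen only for illustration

for x, y in combinations("abcd", 2):
    action = "skip join" if can_skip_join(pair_twu, x, y, min_util) else "join"
    print(x, y, pair_twu.get((x, y), 0), action)
```

In this sketch the pair table is built in one pass over the database, and each later check is a dictionary lookup, which is far cheaper than joining two vertical lists only to discover that the result cannot be a high-utility itemset.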
Similar resources
Optimized High-Utility Itemsets Mining for Effective Association Mining Paper
Received Jan 14, 2017; revised Jun 7, 2017; accepted Sep 11, 2017. Association rule mining is widely used to determine the frequent itemsets of a transactional database; however, market behavior applications also need to consider the utility of itemsets. The Apriori and FP-growth methods generate association rules without taking the utility of items into account. High-utility itemset mining (HUIM) is a w...
A New Algorithm for High Average-utility Itemset Mining
High-utility itemset mining (HUIM) is an emerging field in data mining that has gained growing interest due to its various applications. The goal of this problem is to discover all itemsets whose utility exceeds a minimum threshold. The basic HUIM problem does not consider the length of itemsets in its utility measurement, and utility values tend to become higher for itemsets containing more items...
FHM+: Faster High-Utility Itemset Mining Using Length Upper-Bound Reduction
High-utility itemset (HUI) mining is a popular data mining task, consisting of enumerating all groups of items that yield a high profit in a customer transaction database. However, an important issue with traditional HUI mining algorithms is that they tend to find itemsets having many items. But those itemsets are often rare, and thus may be less interesting than smaller itemsets for users. In ...
EFIM: A Highly Efficient Algorithm for High-Utility Itemset Mining
High-utility itemset mining (HUIM) is an important data mining task with wide applications. In this paper, we propose a novel algorithm named EFIM (EFficient high-utility Itemset Mining), which introduces several new ideas to discover high-utility itemsets more efficiently, both in terms of execution time and memory. EFIM relies on two upper-bounds named sub-tree utility and local utility to mo...
A Survey on Mining High Utility Itemsets from Transactional Databases
Mining high utility itemsets from a transactional database refers to the discovery of itemsets with high utility like profits. Frequent itemset mining (FIM) is one of the most fundamental problems in data mining. In this work, we propose a novel strategy based on the analysis of item co-occurrences to reduce the number of join operations that need to be performed (FHM: Faster High-Utility Miner...